Basic Statistics with Excel

 

1. Descriptive Statistics

 

Use Averages to summarize your data

 

Suppose you’ve done your experiment and you have data on 400 people. Are you going to make a graph with 400 sets of lines on it? Are you going to make a table with 400 rows of data?

Of course not! (HINT) You’re going to summarize your data by giving the AVERAGE values.

 

To practice, use the data file ‘BP Data’ which is on all the school computers in the BI 231 folder.   To get to this, click on the “Courses” folder on the desktop.  Then choose “BI231-net” and finally “BP Data.”  Choose the sheet labeled ‘Descriptive Statistics Practice,’ which presents systolic and diastolic blood pressure data for male and female children from 1-17 years of age.

(Blood pressure data adapted from US Department of Health & Human Services National Heart, Lung & Blood Institute’s “Fourth Report on the Diagnosis, Evaluation, and Treatment of High Blood Pressure in Children and Adolescents”)

 

What is the average systolic blood pressure for the male children?

 

1.      Find the column with the male children’s systolic blood pressure values.

2.      Click on the empty cell below the last number in the column

3.      Click on the “fx” button next to the bar at the top of the spreadsheet

4.      A prompt box will appear.

5.      Click on “AVERAGE”

6.       If it is not on the list, type  “average” in the search field and click “OK”

7.       A dialog box will appear, asking you what cells you want to average

8.       Double check to make sure they are the cells you want averaged, then click OK.

9.       The average will appear! You may forget what it is if you don’t label it, so click on the box just before it and type ‘average’ to remind yourself.

 

When you report an average in your paper, you should also report the number of people you took the average over. It is abbreviated ‘n’. You would write: “The average systolic blood pressure for male children was 115 mm Hg (n=17).” In your data table, you would have a column for the n value.

 

Sex

Number of Subjects

Average Systolic Blood Pressure (mmHg)

Male

17

115

Female

 

 


Now, find the average systolic blood pressure for the female children and fill in the rest of the table above.

 

What is the average systolic blood pressure for just 1 year old children?  This one is a little trickier, because you only want to average two numbers. To do it:

 

1.      Click on the empty cell at the end of the row containing the data (J2)

2.      Hit the “fx” button next to the bar at the top of the spreadsheet

3.      A prompt box will appear.

4.      Click on “AVERAGE”

5.       If it is not on the list, type  “average” in the search field and click “OK”

6.       A dialog box will appear, asking you what cells you want to average

7.       Double check to make sure they are the cells you want averaged, then click OK.

8.      If it is not on the list, click “more functions,” choose “statistical,” then “AVERAGE”

9.      A dialog box will appear, asking you what cells you want to average. It will give you the wrong cells, in this case! To select the two cells, click on the first (C2), hold down the “Ctrl” key and click on the second cell (H2)

10.  Once again, you may forget what the figure in this box means, so you might want to type ‘Average Systolic Blood Pressure by Age’ in the column heading.

 

You would express this in your paper by writing, “The average systolic blood pressure for one-year-olds was 99.5 mm Hg (n=2).”

 

Now calculate the average blood pressures for all ages. You can do this by clicking on the cell with the average figure for one-year-olds. A border will appear around the cell. Put the white cross cursor on the lower right corner of the border, it will turn black.  Hold the mouse button down, and drag downwards to fill the column. This will copy the equation you just created into all the other cells in the column.

 

Age

Number of Subjects

Average Systolic Blood Pressure (mmHg)

Average Diastolic Blood Pressure (mmHg)

 

Age

Number of Subjects

Average Systolic Blood Pressure (mmHg)

Average Diastolic Blood Pressure (mmHg)

1

2

99.5

 

 

10

 

 

 

2

 

 

 

 

11

 

 

 

3

 

 

 

 

12

 

 

 

4

 

 

 

 

13

 

 

 

5

 

 

 

 

14

 

 

 

6

 

 

 

 

15

 

 

 

7

 

 

 

 

16

 

 

 

8

 

 

 

 

17

 

 

 

9

 

 

 

 

 

 

 

 

 

Graph the Average Values to make your data presentation simpler.

 

Create a chart comparing the average systolic blood pressures of male to the average systolic blood pressure of female children.

 

1.      This chart will only contain two values. To select them, click on one value and then hold the ‘Ctrl’ key down and click on the second value.

2.      When you’ve selected both cells, click on the “Insert” tab and select the 2-D column graph.

3.      You will notice that the columns are labeled “1 & 2” when they should be labeled Male & Female.  To change this right click on the graph and choose “Select Data.”

4.      Click on “Edit” for the horizontal axis labels.

5.      You can then go to your spreadsheet, hold down the “Ctrl” key and click on “Male” and “Female” and click OK.

6.      To add a title and axis labels click on the ‘Design’ tab and choose  Chart Layouts.” Choose a layout that includes a title and a label for each axis.  Now you can edit the title until it says what you want it to.  When you title your graph, it should specifically state what can be concluded from your graph.

7.      You’ll see a big problem with this chart – it only shows blood pressure values starting at 112 mm Hg. Doesn’t it look as if boys and girls are very different? Charts that do this are regarded as dishonest, so you want to fix it right away. To do this, right click on the ‘112’ value on the chart.

8.      A box will appear, choose “Format Axis.”

9.      For the minimum, select “Fixed” and set at 0.  Click close.

 

Is the average good enough? Adding Standard Deviations

 

Which average is more accurate – the average value for male children’s blood pressure, or the average value for one-year-olds’ blood pressure? If you look at the data, you see that the average for male children is less accurate. Several of the children have blood pressures that are quite different from the average. So if you tell your readers the average value, you might be misleading them. How can you give them a better idea of the data, without making them read the whole data set?

 

You do this by giving not only the average but the STANDARD DEVIATION.

 

To calculate the standard deviation, follow the same set of steps you did for the average, but choose STDEV instead of AVERAGE.

 

Calculate the standard deviation for the systolic blood pressures of male children and for the systolic blood pressures of female children. Be careful to check which cells the computer is calculating the data for, so it doesn’t include your averages with the data.

 

You’ll find that the standard deviation for the male children’s blood pressure is higher than the standard deviation for female children’s blood pressure. That means you can trust the average for females more than the average for male children.


To present these data to your readers, you would write:

 

“The systolic blood pressure of male children from ages one to 17 averaged 115 mm Hg (n = 17; S.D. = 9.76mm Hg).”

 

or you might write:

 

“The systolic blood pressure of male children from ages one to 17 averaged 117 mm Hg +/- 9.76 (n=17).”

 

You should give the standard deviation any time you give an average. You do this by giving the value for the average, plus or minus the standard deviation.

 

Sex

Number of Subjects

Average Systolic Blood Pressure (mmHg)

Male

17

115 +/- 9.76

 

Calculate these standard deviations and express them in table and phrase form below:

 

Sample

Number of Subjects

Average Systolic Blood Pressure (mmHg)

3-year olds

 

 

females

 

 

8-year-olds

 

 

 

3-year-olds:

 

 

females:

 

 

8-year-olds:

 

 

 

You can also represent your standard deviation on your graph by adding error bars. 

1.      Under “Chart Tools, Layout” click on “Error Bars” under the “Analysis” tab.

2.      A few options will appear; select “More Error Bar Options…” at the bottom.

3.      On the pop-up box, choose “Custom” and click on “Specify Value.”

4.      The custom error bars box will appear; for the “Positive Error Value” go to the spreadsheet and select both standard deviations (you will need to hold the Ctrl key down to select both).  You will do the same thing for the “Negative Error Value.”

5.      Your graph will now contain the error bars and give a visual representation of the standard deviation.  You will notice that the two bars overlap; this suggests that the data is not statistically significant.

 

2. Comparative Statistics

 

Statistics aren’t just used to make your data presentation simpler. They’re used to tell whether the difference between two sets of data is significant or not. That is, was it a real difference or just due to chance?

 

To practice with these, go to the second sheet in the BP Data file – the sheet labeled ‘T-test practice data.’ The question you want to answer is, did treatment A increase blood pressure, or were these changes just due to chance?

 

Looking at the raw data, you see that many of the values increased, but some went down. What happens if you compare the average values? Calculate the average values and standard deviations for initial blood pressure and blood pressure after treatment A, and graph them. Don’t forget to label the averages on your spreadsheet, and to adjust the graph axis to start at 0.

 

It’s not clear whether there was a real change, is there? To find out, we’ll have to use some more sophisticated statistics and do a T-test.

 

A T-test compares two sets of values and gives you the p value -- the probability that they are just due to chance. Ideally, you want a very low probability, a p value less than 0.05. To do a T-test and compare these sets of blood pressures:

 

1.      Click on any empty cell

2.      Hit the “fx” button next to the bar at the top of the spreadsheet

3.      A prompt box will appear.

4.      Click on “TTEST”

5.       If it is not on the list, type  “ttest” in the search field and click “OK”

6.       A dialog box will appear. Click in the box next to “Array 1.”

7.      Drag the dialog box out of the way and highlight your first column of numbers.

8.      Click in the box next to “Array 2” and highlight your second column of numbers.

These are the two sets of data you want to compare.

 

To answer the ‘tails’ you need to look at the hypothesis.  If your hypothesis is specific and states that one value would be higher than the other, choose 1 tail and type the number 1 in the box. If you just predicted that they would be different, pick 2 tails and type the number 2 in the box. The hypothesis for the data was “Treatments A & B will both lower systolic blood pressure.”

 

To choose the ‘type’ of T test, you will need to determine who your subjects were.  If you used the same subject for both the initial trial and for Treatment A, you will need to choose a “paired” t-test and will type the number 1 in the box.  If you used a control group of subjects for the initial trial and an experimental group of subjects for Treatment B, you will need to choose an “unpaired” t-test and will type the number 2 in the box.   These data were before-and-after data for the same subjects, so use a paired t-test.

 

Now hit ‘OK’ and see what the number is. This is your p-value, the probability that the difference is just due to chance alone. A p-value of 0.05 or less is generally considered statistically significant, and one of 0.05 or greater indicates no real difference between the groups.

 

In your paper, you would express these data by writing: “Although the average systolic blood pressure increased by 8.2 mm Hg after treatment A, the difference was not statistically significant (p= 0.111).”

 

Now compare the initial systolic blood pressures with the systolic blood pressures after treatment B, using a T-test. Does treatment B cause a statistically significant change in blood pressure? How would you express this?

 

TURN IN:

1.      A printout of your graph comparing the average systolic blood pressures of male and female children.

2.      A sentence that tells what effect treatment B had on systolic blood pressure, and whether it was statistically significant or not.

3.      A graph of the average initial systolic blood pressure, the average blood pressure after treatment A and the average blood pressure after Treatment B.  You graph should include a specific title, labeled axes and error bars representing the standard deviations.